Statistical Analyses of Named Entity Disambiguation Benchmarks
نویسندگان
چکیده
In the last years, various tools for automatic semantic annotation of textual information have emerged. The main challenge of all approaches is to solve ambiguity of natural language and assign unique semantic entities according to the present context. To compare the different approaches a ground truth namely an annotated benchmark is essential. But, besides the actual disambiguation approach the achieved evaluation results are also dependent on the characteristics of the benchmark dataset and the expressiveness of the dictionary applied to determine entity candidates. This paper presents statistical analyses and mapping experiments on different benchmarks and dictionaries to identify characteristics and structure of the respective datasets.
منابع مشابه
Automatic Generation of Benchmarks for Entity Recognition and Linking
The velocity dimension of Big Data plays an increasingly important role in processing unstructured data. Heretofore, no large-scale benchmarks were available to evaluate the performance of named entity recognition and entity linking solutions. This unavailability was due to the creation of gold standards for named entity recognition and entity linking being a time-intensive, costly and error-pr...
متن کاملRobust Named Entity Disambiguation with Random Walks
Named Entity Disambiguation is the task of assigning entities from a Knowledge Graph (KG) to mentions of such entities in a textual document. The state-of-the-art for this task balances two disparate sources of similarity: lexical, defined as the pairwise similarity between mentions in the text and names of entities in the KG; and semantic, defined through some graph-theoretic property of a sub...
متن کاملSIR-NERD: A Chinese Named Entity Recognition and Disambiguation System using a Two-Stage Method
This paper presents our SIR-NERD system for the Chinese named entity recognition and disambiguation Task in the CIPS-SIGHAN joint conference on Chinese language processing (CLP2012). Our system uses a two-stage method and some key techniques to deal with the named entity recognition and disambiguation (NERD) task. Experimental results on the test data shows that the proposed system, which incor...
متن کاملA Joint Named Entity Recognition and Entity Linking System
We present a joint system for named entity recognition (NER) and entity linking (EL), allowing for named entities mentions extracted from textual data to be matched to uniquely identifiable entities. Our approach relies on combined NER modules which transfer the disambiguation step to the EL component, where referential knowledge about entities can be used to select a correct entity reading. Hy...
متن کاملCombining Textual and Graph-Based Features for Named Entity Disambiguation Using Undirected Probabilistic Graphical Models
Named Entity Disambiguation (NED) is the task of disambiguating named entities in a natural language text by linking them to their corresponding entities in a knowledge base such as DBpedia, which are already recognized. It is an important step in transforming unstructured text into structured knowledge. Previous work on this task has proven a strong impact of graph-based methods such as PageRa...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2013